588 research outputs found

    Data-Driven Sparse Structure Selection for Deep Neural Networks

    Full text link
    Deep convolutional neural networks have liberated its extraordinary power on various tasks. However, it is still very challenging to deploy state-of-the-art models into real-world applications due to their high computational complexity. How can we design a compact and effective network without massive experiments and expert knowledge? In this paper, we propose a simple and effective framework to learn and prune deep models in an end-to-end manner. In our framework, a new type of parameter -- scaling factor is first introduced to scale the outputs of specific structures, such as neurons, groups or residual blocks. Then we add sparsity regularizations on these factors, and solve this optimization problem by a modified stochastic Accelerated Proximal Gradient (APG) method. By forcing some of the factors to zero, we can safely remove the corresponding structures, thus prune the unimportant parts of a CNN. Comparing with other structure selection methods that may need thousands of trials or iterative fine-tuning, our method is trained fully end-to-end in one training pass without bells and whistles. We evaluate our method, Sparse Structure Selection with several state-of-the-art CNNs, and demonstrate very promising results with adaptive depth and width selection.Comment: ECCV Camera ready versio

    Food Ingredients Recognition through Multi-label Learning

    Full text link
    Automatically constructing a food diary that tracks the ingredients consumed can help people follow a healthy diet. We tackle the problem of food ingredients recognition as a multi-label learning problem. We propose a method for adapting a highly performing state of the art CNN in order to act as a multi-label predictor for learning recipes in terms of their list of ingredients. We prove that our model is able to, given a picture, predict its list of ingredients, even if the recipe corresponding to the picture has never been seen by the model. We make public two new datasets suitable for this purpose. Furthermore, we prove that a model trained with a high variability of recipes and ingredients is able to generalize better on new data, and visualize how it specializes each of its neurons to different ingredients.Comment: 8 page

    MTDeep: Boosting the Security of Deep Neural Nets Against Adversarial Attacks with Moving Target Defense

    Full text link
    Present attack methods can make state-of-the-art classification systems based on deep neural networks misclassify every adversarially modified test example. The design of general defense strategies against a wide range of such attacks still remains a challenging problem. In this paper, we draw inspiration from the fields of cybersecurity and multi-agent systems and propose to leverage the concept of Moving Target Defense (MTD) in designing a meta-defense for 'boosting' the robustness of an ensemble of deep neural networks (DNNs) for visual classification tasks against such adversarial attacks. To classify an input image, a trained network is picked randomly from this set of networks by formulating the interaction between a Defender (who hosts the classification networks) and their (Legitimate and Malicious) users as a Bayesian Stackelberg Game (BSG). We empirically show that this approach, MTDeep, reduces misclassification on perturbed images in various datasets such as MNIST, FashionMNIST, and ImageNet while maintaining high classification accuracy on legitimate test images. We then demonstrate that our framework, being the first meta-defense technique, can be used in conjunction with any existing defense mechanism to provide more resilience against adversarial attacks that can be afforded by these defense mechanisms. Lastly, to quantify the increase in robustness of an ensemble-based classification system when we use MTDeep, we analyze the properties of a set of DNNs and introduce the concept of differential immunity that formalizes the notion of attack transferability.Comment: Accepted to the Conference on Decision and Game Theory for Security (GameSec), 201

    DeeSIL: Deep-Shallow Incremental Learning

    Full text link
    Incremental Learning (IL) is an interesting AI problem when the algorithm is assumed to work on a budget. This is especially true when IL is modeled using a deep learning approach, where two com- plex challenges arise due to limited memory, which induces catastrophic forgetting and delays related to the retraining needed in order to incorpo- rate new classes. Here we introduce DeeSIL, an adaptation of a known transfer learning scheme that combines a fixed deep representation used as feature extractor and learning independent shallow classifiers to in- crease recognition capacity. This scheme tackles the two aforementioned challenges since it works well with a limited memory budget and each new concept can be added within a minute. Moreover, since no deep re- training is needed when the model is incremented, DeeSIL can integrate larger amounts of initial data that provide more transferable features. Performance is evaluated on ImageNet LSVRC 2012 against three state of the art algorithms. Results show that, at scale, DeeSIL performance is 23 and 33 points higher than the best baseline when using the same and more initial data respectively

    Adversarial attacks hidden in plain sight

    Full text link
    Convolutional neural networks have been used to achieve a string of successes during recent years, but their lack of interpretability remains a serious issue. Adversarial examples are designed to deliberately fool neural networks into making any desired incorrect classification, potentially with very high certainty. Several defensive approaches increase robustness against adversarial attacks, demanding attacks of greater magnitude, which lead to visible artifacts. By considering human visual perception, we compose a technique that allows to hide such adversarial attacks in regions of high complexity, such that they are imperceptible even to an astute observer. We carry out a user study on classifying adversarially modified images to validate the perceptual quality of our approach and find significant evidence for its concealment with regards to human visual perception

    Recycle-GAN: Unsupervised Video Retargeting

    Full text link
    We introduce a data-driven approach for unsupervised video retargeting that translates content from one domain to another while preserving the style native to a domain, i.e., if contents of John Oliver's speech were to be transferred to Stephen Colbert, then the generated content/speech should be in Stephen Colbert's style. Our approach combines both spatial and temporal information along with adversarial losses for content translation and style preservation. In this work, we first study the advantages of using spatiotemporal constraints over spatial constraints for effective retargeting. We then demonstrate the proposed approach for the problems where information in both space and time matters such as face-to-face translation, flower-to-flower, wind and cloud synthesis, sunrise and sunset.Comment: ECCV 2018; Please refer to project webpage for videos - http://www.cs.cmu.edu/~aayushb/Recycle-GA

    Hierarchical ResNeXt Models for Breast Cancer Histology Image Classification

    Full text link
    Microscopic histology image analysis is a cornerstone in early detection of breast cancer. However these images are very large and manual analysis is error prone and very time consuming. Thus automating this process is in high demand. We proposed a hierarchical system of convolutional neural networks (CNN) that classifies automatically patches of these images into four pathologies: normal, benign, in situ carcinoma and invasive carcinoma. We evaluated our system on the BACH challenge dataset of image-wise classification and a small dataset that we used to extend it. Using a train/test split of 75%/25%, we achieved an accuracy rate of 0.99 on the test split for the BACH dataset and 0.96 on that of the extension. On the test of the BACH challenge, we've reached an accuracy of 0.81 which rank us to the 8th out of 51 teams

    Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks

    Full text link
    Learning deeper convolutional neural networks becomes a tendency in recent years. However, many empirical evidences suggest that performance improvement cannot be gained by simply stacking more layers. In this paper, we consider the issue from an information theoretical perspective, and propose a novel method Relay Backpropagation, that encourages the propagation of effective information through the network in training stage. By virtue of the method, we achieved the first place in ILSVRC 2015 Scene Classification Challenge. Extensive experiments on two challenging large scale datasets demonstrate the effectiveness of our method is not restricted to a specific dataset or network architecture. Our models will be available to the research community later.Comment: Technical report for our submissions to the ILSVRC 2015 Scene Classification Challenge, where we won the first plac

    Memory Aware Synapses: Learning what (not) to forget

    Full text link
    Humans can learn in a continuous manner. Old rarely utilized knowledge can be overwritten by new incoming information while important, frequently used knowledge is prevented from being erased. In artificial learning systems, lifelong learning so far has focused mainly on accumulating knowledge over tasks and overcoming catastrophic forgetting. In this paper, we argue that, given the limited model capacity and the unlimited new information to be learned, knowledge has to be preserved or erased selectively. Inspired by neuroplasticity, we propose a novel approach for lifelong learning, coined Memory Aware Synapses (MAS). It computes the importance of the parameters of a neural network in an unsupervised and online manner. Given a new sample which is fed to the network, MAS accumulates an importance measure for each parameter of the network, based on how sensitive the predicted output function is to a change in this parameter. When learning a new task, changes to important parameters can then be penalized, effectively preventing important knowledge related to previous tasks from being overwritten. Further, we show an interesting connection between a local version of our method and Hebb's rule,which is a model for the learning process in the brain. We test our method on a sequence of object recognition tasks and on the challenging problem of learning an embedding for predicting triplets. We show state-of-the-art performance and, for the first time, the ability to adapt the importance of the parameters based on unlabeled data towards what the network needs (not) to forget, which may vary depending on test conditions.Comment: ECCV 201

    Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes

    Get PDF
    We propose a novel method for neural network quantization that casts the neural architecture search problem as one of hyperparameter search to find non-uniform bit distributions throughout the layers of a CNN. We perform the search assuming a Multi-Task Gaussian Processes prior, which splits the problem to multiple tasks, each corresponding to different number of training epochs, and explore the space by sampling those configurations that yield maximum information. We then show that with significantly lower precision in the last layers we achieve a minimal loss of accuracy with appreciable memory savings. We test our findings on the CIFAR10 and ImageNet datasets using the VGG, ResNet and GoogLeNet architectures.Comment: Accepted for publication at ECCV 2020. Code availiable at https://code.active.vision . Updated for typ
    corecore